Abstract

Networks are useful for describing systems of interacting objects, where the nodes represent the 
objects and the edges represent the interactions between them. The applications include chemical 
and metabolic systems, food webs as well as social networks. Lately, it was found that many of 
these networks display some common topological features, such as high clustering, small average 
path length (small world networks) and a power-law degree distribution (scale free networks). The 
topological features of a network are commonly related to the network's functionality. However, 
the topology alone does not account for the nature of the interactions in the network and their 
strength. Here we introduce a method for evaluating the correlations between pairs of nodes in the 
network. These correlations depend both on the topology and on the functionality of the network. 
A network with high connectivity displays strong correlations between its interacting nodes and 
thus features small-world functionality. We quantify the correlations between all pairs of nodes in 
the network, and express them as matrix elements in the correlation matrix. From this information 
one can plot the correlation function for the network and extract the correlation length. The
connectivity of a network is then defined as the ratio between this correlation length and the average 
path length of the network. Using this method we distinguish between a topological small world 
and a functional small world, where the latter is characterized by long range correlations and high 
connectivity. Clearly, networks that share the same topology may have different connectivities,
based on the nature and strength of their interactions. The method is demonstrated on metabolic 
networks, but can be readily generalized to other types of networks. 

PACS numbers: 89.75.Hc,89.75.Fb,89.75.Da 
I. INTRODUCTION



A network, or graph, consists of a set of nodes, from which selected pairs are connected 
by edges. Such mathematical constructions provide a useful description for systems of 
interacting objects. More specifically, network concepts are used in the analysis of chemical 
and metabolic systems as well as food webs and social networks. In recent years, there 
has been much progress in the analysis of the topology of these networks. The network 
topology can be characterized by features such as the number of nodes, J, and the average 
degree $\langle k \rangle$, namely the average number of edges that are connected to a node. A more
detailed description of the network topology is given by the degree distribution, P(k), which 
is the probability that a randomly selected node has exactly k edges. Another important 
topological feature measures the tendency of a network to support the formation of cliques. 
A clique is a fully connected set of nodes, namely each pair of nodes in such a set is connected 
by an edge. The tendency of a network to form cliques can be characterized by the clustering 
coefficient [1, 2, 3, 4]. Roughly speaking, when a network has a high clustering coefficient it
is considered to be highly connected. A low clustering coefficient implies that the network 
is only loosely connected. 

Networks exhibit a unique metric, in which the distance, d, between any two nodes is 
given by the minimal number of edges one has to cross in order to pass from one node to 
the other. In some cases, the distance can be used as a measure for the connection between 
a pair of nodes. This is based on the assumption that two directly reacting nodes ($d = 1$)
strongly affect each other, whereas distant nodes weakly affect one another. The average
path length in a network, $\langle d \rangle$, is obtained by averaging over the distance between all pairs
of nodes in the network. The parameters defined above were evaluated for random graphs
and their dependence on $J$ and $\langle k \rangle$ was found. However, the analysis of realistic
networks shows that they are very different from random graphs [9]. In realistic networks
it is common to find surprisingly low average path lengths, and relatively high clustering
coefficients. In many cases the degree distribution follows a power law form, rather than
the Poisson distribution which is the signature of random networks. These features were
found to appear in social networks, the world wide web, ecological networks
and metabolic networks [20, 21, 22, 23, 24, 25, 29, 30, 31].



While the topological properties of realistic networks have been elucidated, the implica- 
tions on the functionality of these networks are not fully understood. The small average path 
length and the high clustering of many realistic networks, render them as small world net- 
works. At first glance, the small world characteristics imply that realistic networks function 
as highly connected systems. Indeed, one expects that if the distance between two nodes is 
small, the correlation between them will be strong. For instance, in the case of a metabolic 
network, the concentrations of interacting proteins will strongly depend on each other. A 
perturbation in the concentration of one protein is likely to affect the concentration of the 
other. This might lead to the conclusion that small world networks are highly susceptible 
to local perturbations, as almost all the nodes are just a short distance away. The problem
with this topological analysis is that it does not relate to the specific function of a given
network or to the strength of the interactions between its nodes [32]. Consider, for instance,
a metabolic network and an ecological network sharing the same topology. In what sense 
can these two networks be regarded as similar networks? Even if the two have the same 
topological structure, the nature of their functional behavior is fundamentally different. The 
process of predation may lead to different behavior than the process of chemical reaction 
between proteins. Even two metabolic networks may function differently if the interaction 
strengths in one network are higher than in the other. 

In this paper, we present a method for obtaining the correlation matrix of a given network. 
The elements of this matrix provide the magnitudes of the correlations between pairs of nodes 
in the network. In certain cases the matrix can be used to characterize some of the global 
features of the network's functionality. For instance, it can be used to identify domains 
of high correlations versus domains of low correlations. Another use of the correlation 
matrix is in quantifying the connectivity of a network in a way that accounts both for 
its topology and for the specific processes taking place between its nodes. This method, 
referred to as the network correlation function (NCF) method, enables us to determine 
whether a topological small world (TSW) network will also be a functional small world 
(FSW) network. A network will be regarded as an FSW network if the correlations between 
its nodes are typically high, and thus the state of one node is highly dependent on that of 
the others. Here we apply the method to metabolic networks with various topologies and 
different interaction strengths. In these networks, each node represents a reactant, and is 
assigned a dynamical variable that accounts for the concentration of this reactant. The time 
dependence of these concentrations is described by a set of rate equations. The equations
include terms that describe the interaction processes in the given network. They account 
both for the topology and for the functionality of the network. From the solution of the 
rate equations under steady state conditions one can extract the correlation between each 
pair of nodes. In certain cases, networks are found to have a typical correlation length. If 
the distance between two nodes is much higher than this length, the correlation between 
them is negligible. To quantify the connectivity of the network, one compares the correlation 
length with the average path length. If the average path length is smaller than the
typical correlation length, the network will be considered as an FSW network. In this case, 
local perturbations will have a global effect on the network. The FSW network will thus be 
regarded as strongly connected. On the other hand, if the average path length is larger than 
the typical correlation length, the network will be considered as weakly connected. 

The paper is organized as follows. In Sec. II we present the methodology, and demonstrate 
its applicability to metabolic networks. In Sec. III we analyze some simple, analytically
soluble networks, and in Sec. IV we present a computational analysis of a set of more 
complex networks, culminating in an example of a scale free network. The results are 
summarized and discussed in Sec. V. 

II. THE METHOD 

Below we present the NCF method for evaluating the connectivity of interaction networks. 
For concreteness, we focus on the specific case of metabolic networks. It is straightforward to 
generalize the method to other types of networks. Consider a metabolic network consisting 
of $J$ different molecular species, $X_i$, $i = 1, \dots, J$. The generation rate of the $X_i$ molecules
is $g_i$ (s$^{-1}$). Once a molecule is formed it may undergo degradation at a rate $w_i$ (s$^{-1}$).
Certain pairs of molecules, $X_i$ and $X_j$, may react to form a more complex molecule $X_k$
($X_i + X_j \to X_k$). In general, the product molecules $X_k$ may be reactive and represented by
another node in the network. For simplicity, in the analysis below, we assume that the $X_k$
molecules are not reactive and thus do not play a further role in the network. We also limit
the discussion to the case in which a molecular species does not react with itself, namely
reactions of the form $X_i + X_i \to X_k$ are excluded.

The reaction rate between the Xi and Xj molecules is given by the reaction rate matrix 
$A$. Its matrix elements are $a_{ij}$ (s$^{-1}$), where $i, j = 1, 2, \dots, J$. Note that for non-interacting
pairs of molecules $a_{ij} = 0$. The network topology matrix, $M$, is also a $J \times J$ dimensional
matrix, which is defined as follows: $M_{ij} = 1$ if $X_i$ and $X_j$ react with each other, and $M_{ij} = 0$
otherwise. Let $D_{ij}$ be the distance between the species $X_i$ and $X_j$ in the metric of the
network. The average path length is thus

$$\langle d \rangle = \frac{1}{J(J-1)} \sum_{i \neq j} D_{ij}. \tag{1}$$

The parameter $\langle d \rangle$ provides some information as to the connectivity of the network, but
only in the topological sense.
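
As a minimal sketch of Eq. (1), the average path length can be computed with a breadth-first search started from every node. The adjacency-list input and the helper names `average_path_length` and `path_graph` are assumptions made for illustration and are not part of the original analysis.

```python
from collections import deque


def average_path_length(adj):
    """Average shortest-path distance over all ordered pairs i != j, as in Eq. (1).

    adj[i] is the list of neighbours of node i (hypothetical input format).
    """
    J = len(adj)
    total = 0
    for source in range(J):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != J:
            raise ValueError("network is not connected")
        total += sum(dist.values())
    return total / (J * (J - 1))


def path_graph(J):
    """Open chain of J nodes, each linked to its nearest neighbours."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < J] for i in range(J)]


if __name__ == "__main__":
    # For an open chain the result should equal (J + 1) / 3.
    print(average_path_length(path_graph(5)))
```

For an open chain of five nodes this evaluates to $(J+1)/3 = 2$, the closed form quoted later for the linear network.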

In order to account for the functionality of the network we consider the rate equations, 
which take the form 

$$\frac{dn_i}{dt} = g_i - w_i n_i(t) - \sum_{j=1}^{J} a_{ij} n_i(t) n_j(t), \tag{2}$$

where $n_i(t)$ is the time dependent concentration of the molecule $X_i$. The first term on the
right hand side of Eq. (2) accounts for the generation of $X_i$ molecules. The second term
accounts for the process of degradation, and the third term accounts for reactions between
molecules. The steady state (SS) solution of the rate equations, $n_i$, can be obtained by
setting the left hand side of Eq. (2) to zero. One obtains

$$n_i = \frac{g_i}{w_i^{\rm ss}}, \tag{3}$$

where $w_i^{\rm ss} = w_i + \sum_j a_{ij} n_j$ is the effective degradation rate. Our goal is to characterize
the correlations between the different species around the steady state condition. Roughly
speaking, we are asking the following question: While at steady state, to what extent does
a small perturbation in the concentration of the species $X_j$ affect the concentration of the
species $X_i$? To this end we define the first order correlation matrix as



$$C_{ij} = \left. \frac{\partial n_i}{\partial n_j} \right|_{\rm ss}, \tag{4}$$

which, using Eq. (3), takes the form

$$C_{ij} = -\frac{a_{ij} g_i}{\left( w_i + \sum_k a_{ik} n_k \right)^2}. \tag{5}$$

Note that the elements of the first order correlation matrix are non-zero only if the species $X_i$
and $X_j$ directly interact with each other. Topologically, this means that the matrix element
$C_{ij}$ vanishes unless $D_{ij} = 1$. Indirect correlations between species that are connected via
a third species are not accounted for (hence the term first order correlation matrix). To
account for indirect correlations, one has to compute the complete correlation matrix



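$$G_{ij} = \left. \frac{dn_i}{dn_j} \right|_{\rm ss}. \tag{6}$$

Clearly, the diagonal terms of this matrix must satisfy

$$\left. \frac{dn_i}{dn_i} \right|_{\rm ss} = 1 \tag{7}$$

for $i = 1, \dots, J$. For the off-diagonal terms, $i \neq j$, one can write

$$\left. \frac{dn_i}{dn_j} \right|_{\rm ss} = \sum_{k=1}^{J} \left. \frac{\partial n_i}{\partial n_k} \right|_{\rm ss} \left. \frac{dn_k}{dn_j} \right|_{\rm ss}. \tag{8}$$

In matrix form, these equations become

$$G_{ii} = 1, \qquad G_{ij} = \sum_{k=1}^{J} C_{ik} G_{kj} \quad (i \neq j). \tag{9}$$

Eq. (9) is a set of $J \times J$ coupled linear equations. Their solution provides the complete
correlation matrix, $G_{ij}$.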
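
One possible way to solve the linear system for the complete correlation matrix is sketched below. For each column $j$ the condition $G_{jj} = 1$ is substituted, leaving $J - 1$ linear equations for the remaining entries. The function names and the plain Gaussian elimination are assumptions chosen to keep the example self-contained; any linear solver could be used instead.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def complete_correlation(C):
    """Complete correlation matrix G from the first order matrix C.

    For each column j, with G_jj = 1 fixed, the off-diagonal entries obey
    sum_{k != j} (delta_ik - C_ik) G_kj = C_ij for every i != j.
    """
    J = len(C)
    G = [[0.0] * J for _ in range(J)]
    for j in range(J):
        rows = [i for i in range(J) if i != j]
        A = [[(1.0 if i == k else 0.0) - C[i][k] for k in rows] for i in rows]
        b = [C[i][j] for i in rows]
        column = solve_linear(A, b)
        G[j][j] = 1.0
        for i, value in zip(rows, column):
            G[i][j] = value
    return G


if __name__ == "__main__":
    q = -0.25
    C = [[0.0, q, q], [q, 0.0, q], [q, q, 0.0]]  # three mutually reacting species
    for row in complete_correlation(C):
        print(row)
```

For three mutually reacting species with identical $q$, symmetry gives $G_{ij} = q/(1-q)$ for $i \neq j$, which provides a simple check of the solver.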

Typically, one expects the correlation between two species to decay as a function of the
distance, $D_{ij}$, between them. The rate of this decay provides the correlation length. To
obtain the correlation function we identify all pairs of species $i$ and $j$ that are separated by
a distance $d$ from each other. We then average the magnitude of the correlations, $|G_{ij}|$, over
all these pairs. The correlation function vs. distance takes the form

$$F_{\rm cor}(d) = \frac{\sum_{i,j} |G_{ij}| \, \delta_{d, D_{ij}}}{\sum_{i,j} \delta_{d, D_{ij}}}, \tag{10}$$

where $d$ is an integer. The function $\delta_{x,y} = 1$ if $x = y$ and zero otherwise. Note that in the
definition of $F_{\rm cor}(d)$ the absolute value of the matrix terms $G_{ij}$ was used. This is because
certain pairs of species $X_i$ and $X_j$ may be positively correlated, and others may be negatively
correlated. In any case, the focus here is merely on the strength of their mutual correlations
and not on the sign of these correlations.
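
The averaging over pairs at equal distance can be sketched as follows. The inputs `G` (complete correlation matrix) and `D` (distance matrix) are assumed to be nested lists, and the function name is hypothetical.

```python
def correlation_function(G, D):
    """Average |G_ij| over all pairs (i, j) with D_ij = d, for every distance d.

    Returns a dict mapping each distance d to the averaged magnitude.
    The diagonal pairs (d = 0) are included and give the value 1.
    """
    sums, counts = {}, {}
    J = len(G)
    for i in range(J):
        for j in range(J):
            d = D[i][j]
            sums[d] = sums.get(d, 0.0) + abs(G[i][j])
            counts[d] = counts.get(d, 0) + 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}


if __name__ == "__main__":
    # Illustrative three-node chain with made-up correlation values.
    G = [[1.0, -0.5, 0.25], [-0.5, 1.0, -0.5], [0.25, -0.5, 1.0]]
    D = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
    print(correlation_function(G, D))
```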

To obtain the correlation length, one may fit the function $F_{\rm cor}(d)$ to an exponential of
the form $K \exp(-d/d_0)$. The distance $d_0$ is the correlation length. It approximates the
distance within which strong correlations between different species are maintained. This
distance is determined by the dynamical processes and by the characteristic rate constants
of a specific network. It thus accounts not only for the topology of the system, but also for
its functionality. Finally, we define the connectivity of a network as

$$\eta = \frac{d_0}{\langle d \rangle}. \tag{11}$$

In the limit where $d_0$ is much greater than the average path length, most of the nodes are
within the correlation length from one another, and the components of the network are
highly correlated. The concentrations of different species are strongly dependent on each
other, and the network is an FSW network. Correspondingly, one obtains that $\eta \gg 1$. If
$d_0$ is much smaller than the average path length, the effect of a perturbation in the
concentration of one species decays on average before it reaches most of the other species.
Perturbations are thus local, and the connectivity of the network is said to be low. While
topologically, such a network might be considered a small world network, functionally it is
a loosely connected network.
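
A minimal sketch of the fitting step is given below. It assumes an ordinary least-squares fit of $\ln F_{\rm cor}(d)$ against $d$, which is only one of several ways to fit an exponential; the text does not specify the fitting procedure. The function names are hypothetical.

```python
import math


def correlation_length(F):
    """Fit ln F(d) = ln K - d / d0 by least squares and return d0.

    F maps integer distances to positive correlation magnitudes.
    """
    ds = [d for d in F if F[d] > 0]
    if len(ds) < 2:
        raise ValueError("need at least two distances with positive F")
    ys = [math.log(F[d]) for d in ds]
    mean_d = sum(ds) / len(ds)
    mean_y = sum(ys) / len(ys)
    sxx = sum((d - mean_d) ** 2 for d in ds)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in zip(ds, ys))
    slope = sxy / sxx
    return -1.0 / slope


def connectivity(d0, average_path_length):
    """Ratio of correlation length to average path length."""
    return d0 / average_path_length


if __name__ == "__main__":
    F = {d: 0.8 * math.exp(-d / 2.5) for d in range(6)}  # synthetic data
    d0 = correlation_length(F)
    print(d0, connectivity(d0, 5.0))
```

With synthetic data generated from an exact exponential, the fit recovers the correlation length used to generate it, and a value of the ratio below unity marks the network as weakly connected in the functional sense.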

III. ANALYTICALLY SOLUBLE NETWORKS 
A. Linear Metabolic Network 

To demonstrate the NCF method we now refer to a set of simple examples, which are 
analytically soluble. Consider a linear metabolic network of $J$ species ($J \gg 1$). The species
$X_i$, $i = 1, \dots, J$, reacts with its nearest neighbors, namely $X_{i-1}$ and $X_{i+1}$. This network is
shown in Fig. 1. For simplicity, we take all the reacting species to have identical parameters,
namely $g_i = g$ and $w_i = w$ for $i = 1, \dots, J$. Also, $a_{ij} = a$ if $i = j \pm 1$, and $a_{ij} = 0$
otherwise. Taking the limit in which the number of species $J$ is very large, we can avoid the
complexities related to the boundaries of the network. Under these conditions, the steady
state solution for all the species is the same, enabling us to omit the index $i$ from the steady
state concentrations $n_i$. The reaction rate matrix for this network is



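$$A = \begin{pmatrix} 0 & a & 0 & \cdots & 0 \\ a & 0 & a & & \vdots \\ 0 & a & 0 & \ddots & 0 \\ \vdots & & \ddots & \ddots & a \\ 0 & \cdots & 0 & a & 0 \end{pmatrix}. \tag{12}$$

For a linear network, the average distance between pairs is $\langle d \rangle = (J+1)/3$, which for $J \gg 1$
can be approximated by

$$\langle d \rangle \simeq \frac{J}{3}, \tag{13}$$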



namely, $\langle d \rangle$ scales linearly with $J$. The clustering coefficient for this network is zero. Thus,
from the topological point of view, the linear network cannot be considered a small world.
The rate equation for the linear metabolic network is

$$\frac{dn}{dt} = g - w n(t) - 2a n^2(t), \tag{14}$$

leading to the steady state solution

$$n = \frac{-w + \sqrt{w^2 + 8ag}}{4a}. \tag{15}$$



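The first order correlation matrix takes the form

$$C = \begin{pmatrix} 0 & q & 0 & \cdots & 0 \\ q & 0 & q & & \vdots \\ 0 & q & 0 & \ddots & 0 \\ \vdots & & \ddots & \ddots & q \\ 0 & \cdots & 0 & q & 0 \end{pmatrix}, \tag{16}$$

where $q = -ag/(w + 2an)^2$ [Eq. (5)]. Using Eq. (15), one obtains

$$q = -\frac{4ag}{\left( w + \sqrt{w^2 + 8ag} \right)^2}. \tag{17}$$

Since the parameters $a$, $g$ and $w$ are positive, it is easy to see that $q$ takes values only in the
range $-1/2 < q < 0$. This fact will be used in the analysis below.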
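
The closed forms for the steady state concentration and for $q$ can be checked numerically with a short sketch. The function names are hypothetical; the formulas are those of Eqs. (15) and (17).

```python
import math


def steady_state_linear(a, g, w):
    """Positive root of g - w n - 2 a n^2 = 0."""
    return (-w + math.sqrt(w * w + 8 * a * g)) / (4 * a)


def q_linear(a, g, w):
    """Nearest-neighbour element of the first order correlation matrix."""
    return -4 * a * g / (w + math.sqrt(w * w + 8 * a * g)) ** 2


if __name__ == "__main__":
    a, g, w = 1.0, 1.0, 1.0
    n = steady_state_linear(a, g, w)
    print("n =", n)
    print("residual of rate equation:", g - w * n - 2 * a * n * n)
    print("q from closed form:", q_linear(a, g, w))
    print("q from definition: ", -a * g / (w + 2 * a * n) ** 2)
```

For $a = g = w = 1$ both expressions for $q$ give $-1/4$, and scanning $a$ over many orders of magnitude keeps $q$ inside the stated range.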

To obtain the complete correlation matrix, one has to solve Eq. (9). In the case of a
linear metabolic network it takes the form

$$G_{ii} = 1, \qquad G_{ij} = q \left( G_{i-1,j} + G_{i+1,j} \right) \quad (i \neq j). \tag{18}$$

Based on the symmetry of the problem, it is clear that for a given choice of the parameters,
the correlation between the species $X_i$ and $X_j$ depends only on the distance $d = |j - i|$
between them. Using this indexation, Eq. (18) becomes




$$G_0 = 1, \qquad G_d = q \left( G_{d-1} + G_{d+1} \right) \quad (d \geq 1), \tag{19}$$

where $G_d$ is the correlation matrix term for pairs of species $X_i$ and $X_j$ where $|j - i| = d$.
Since the correlation is expected to decay exponentially as a function of the distance between
the nodes, we search for a solution of the form $G_d = \exp(-kd)$. Inserting this expression
into Eq. (19) we obtain two possible solutions of the form



$$k = \ln\left( x \pm \sqrt{x^2 - 1} \right), \tag{20}$$

where $x = 1/(2q)$. Since the parameter $q$ is limited to the range $-1/2 < q < 0$, the parameter
$x$ can take values only in the range $-\infty < x < -1$. The physically relevant solution must
satisfy the condition that the correlation between very distant species will vanish. This
constraint requires that $|x \pm \sqrt{x^2 - 1}| > 1$. To satisfy this condition for $-1/2 < q < 0$, one
has to choose the solution where the square root is subtracted. The result is

$$k = \ln\left| x - \sqrt{x^2 - 1} \right| + i\pi, \tag{21}$$

where $i = \sqrt{-1}$. The correlation between species as a function of the distance $d$ between
them is thus

$$G_d = (-1)^d \exp\left( -\ln\left| x - \sqrt{x^2 - 1} \right| \, d \right). \tag{22}$$
The pre-factor of the exponent accounts for the fact that since $q < 0$, the correlations
between directly interacting species are negative. Thus, pairs of species which are next-nearest
neighbors in the network tend to have positive correlations between them. The
correlation function [Eq. (10)] is the absolute value of $G_d$, which comes to be

$$F_{\rm cor}(d) = e^{-d/d_0}, \tag{23}$$

where

$$d_0 = \left( \ln\left| x - \sqrt{x^2 - 1} \right| \right)^{-1} \tag{24}$$

is the correlation length of the network. It is interesting to examine the limit in which
$q \to 0$. In this limit the correlations are weak and the typical correlation length converges
to $d_0 = -1/\ln|q|$. The correlation function approaches $F_{\rm cor}(d) \simeq |q|^d$. In this limit, the
correlation between a pair of species is dominated by the shortest path between them. For
each step along that path, the correlation is multiplied by a factor of $q$. Thus, the magnitude
of the correlation between a pair of species at distance $d$ from each other is approximated
by $|q|^d$.

One can identify two limits. In the limit where $ag \gg w^2$ the correlations are strong,
$q \to -1/2$ and $d_0 \to \infty$. In this limit, the reaction process is dominant and long range
correlations are observed. In the limit where $ag \ll w^2$, the correlations are weak, and
$q \to 0$. In this limit the degradation process is dominant and the correlation length is small.
In Fig. 2 we present the correlation length, $d_0$, as a function of the parameters $a$, $w$ and $g$ for
a linear metabolic network. The correlation length increases with $a$ and $g$ (as the reaction
process becomes dominant), and decreases with $w$ (as the process of degradation becomes
dominant).
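
The dependence of the correlation length on the rate constants can be explored with the following sketch, which combines the closed form for $q$ with the expression for $d_0$ in terms of $x = 1/(2q)$. The function names are hypothetical.

```python
import math


def q_linear(a, g, w):
    """Nearest-neighbour first order correlation of the linear network."""
    return -4 * a * g / (w + math.sqrt(w * w + 8 * a * g)) ** 2


def correlation_length_linear(q):
    """Correlation length d0 = 1 / ln|x - sqrt(x^2 - 1)| with x = 1 / (2 q)."""
    if not -0.5 < q < 0:
        raise ValueError("q must lie in the open interval (-1/2, 0)")
    x = 1 / (2 * q)
    return 1 / math.log(abs(x - math.sqrt(x * x - 1)))


if __name__ == "__main__":
    for a in (0.01, 1.0, 10.0, 1000.0):
        q = q_linear(a, 1.0, 1.0)
        print(f"a = {a:8.2f}  q = {q:+.4f}  d0 = {correlation_length_linear(q):.4f}")
    # Weak-correlation limit: d0 approaches -1 / ln|q|.
    q = -1e-4
    print(correlation_length_linear(q), -1 / math.log(abs(q)))
```

For $g = w = 1$, the value $a = 10$ gives $q = -0.4$ and $d_0 = 1/\ln 2$, while raising $w$ at fixed $a$ and $g$ shortens the correlation length.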

Using Eqs. (11) and (13), the connectivity $\eta$ can be expressed by $\eta = 3d_0/J$. The linear
network clearly demonstrates the difference between the concepts of TSW networks and
FSW networks. In the topological sense it is as far as a network can be from a small world
network, as the distance scales linearly with the network size, and the clustering coefficient
is zero. However, in the functional sense the linear network can be a small world network,
when the reaction terms are sufficiently dominant, enabling $d_0$ to become larger than the
average path length, $J/3$.

In order to examine the theoretical predictions of the method, we conducted a simulation 
of the long linear metabolic network described above. In this simulation we constructed
a linear network of $J = 100$ reacting species with periodic boundary conditions, namely,
$X_0$ reacts with $X_{99}$. At time $t = 0$ we assigned to each reacting species its steady state
concentration $n_i$. Then we forced the concentration $n_0$ to be slightly above its steady state
value, namely $n_0(t) = n_0 + \Delta n_0$, where $\Delta n_0 \ll n_0$. We then let the network relax to
its new steady state. We denote the resulting change in the steady state concentration of
the species $X_i$ by $\Delta n_i$. In Fig. 3 we show the absolute value of $\Delta n_i / \Delta n_0$ as a function
of $d$, the distance of the node $X_i$ from the perturbed node, $X_0$. These results, obtained
from direct integration of the rate equations, are shown for different values of the reaction
rate $a$ (symbols). When $a$ increases the typical correlation length becomes higher, and the
effect of the local perturbation of $X_0$ extends to more distant species. The results are in
good agreement with the theoretically derived correlation function, $F_{\rm cor}(d)$ [Eq. (23)] (solid
lines). Slight deviations appear for distant species. This is because in numerical simulations
one must choose $\Delta n_0$ to be a finite perturbation. The resulting deviation in the rest of
the species is thus affected by higher order terms in the Taylor expansion which are not
accounted for by our method. Here the generation rates and the degradation rates of all
the species are $g = 1$ and $w = 1$ respectively. The network becomes an FSW network once
$d_0 > 17$, which is approximately the average path length for this network. This condition is
satisfied for $a > 2 \times 10^5$.
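
A scaled-down version of this perturbation experiment can be sketched as follows. To keep the example fast in plain Python it assumes a ring of only 20 species, $a = g = w = 1$, and a simple forward Euler integrator; these choices, the step size, and the function name are illustrative assumptions rather than the settings used for Fig. 3.

```python
import math


def ring_response(J=20, a=1.0, g=1.0, w=1.0, delta=1e-4, dt=0.01, steps=4000):
    """Clamp n_0 at n + delta, relax the other species, and compare with theory.

    Returns (measured, theory), where measured[i] = dn_i / dn_0 and
    theory[i] = (-1)^d exp(-d / d0) with d the ring distance from node 0.
    """
    root = math.sqrt(w * w + 8 * a * g)
    n_ss = (-w + root) / (4 * a)                 # uniform steady state
    n = [n_ss] * J
    n[0] = n_ss + delta                          # forced perturbation
    for _ in range(steps):
        updated = n[:]
        for i in range(1, J):                    # node 0 stays clamped
            neighbours = n[i - 1] + n[(i + 1) % J]
            updated[i] = n[i] + dt * (g - w * n[i] - a * n[i] * neighbours)
        n = updated
    measured = [(n[i] - n_ss) / delta for i in range(J)]

    q = -4 * a * g / (w + root) ** 2
    x = 1 / (2 * q)
    d0 = 1 / math.log(abs(x - math.sqrt(x * x - 1)))
    theory = []
    for i in range(J):
        d = min(i, J - i)
        theory.append((-1) ** d * math.exp(-d / d0))
    return measured, theory


if __name__ == "__main__":
    measured, theory = ring_response()
    for d in range(5):
        print(d, f"{measured[d]:+.6f}", f"{theory[d]:+.6f}")
```

The alternating sign of the response and its decay with distance should both be visible in the printed columns; small differences are expected from the finite perturbation and the finite ring size.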

B. Perfect Tree Network 

Hierarchical structures are common in realistic networks. For instance, ecological net- 
works have in many cases distinct trophic levels. Social organizations are also constructed 
in a tree-like framework. Here we relate to a hierarchical metabolic network. Consider a 
metabolic network of $J$ nodes where each node is assigned a level $l$ ($l = 0, \dots, N$). The
highest level $l = N$ consists of a single node, referred to as the root. Each node at level $l$ is
then connected to exactly one node at level $l + 1$ (the parent) and $m$ nodes at level $l - 1$ (the
siblings). The parameter $m$ is defined as the order of the tree. The degree of all the nodes
(except those at the levels zero and $N$) is thus $r = m + 1$ (Fig. 4). Since this network is
hierarchical, the up and down directions are well defined. Stepping from a node at level $l$ to
a node at level $l + 1$ will be considered going up the network, while stepping from level $l$ to
level $l - 1$ is going down the network. Note that in a tree-like network it is not possible to go
sideways, as there is no edge connecting two nodes at the same level. Consider a species $X_i$,
which is at a distance $d$ from some other species $X_j$. The path between them consists of $u$
steps up the network and $v$ steps down the network. The total distance satisfies $d = u + v$,
and the path between them can be denoted by $\mathbf{d} = (u, v)$. For example, the path between
the two shaded nodes in Fig. 4 is $\mathbf{d} = (2, 3)$ and the distance is $d = 5$. Two species are
said to be located in the same branch if in the path between them either $u = 0$ or $v = 0$.
The reaction rate matrix and the first order correlation matrix have non-zero values only
for directly interacting species, namely, for pairs of species where either $u = 1$ and $v = 0$, or
$v = 1$ and $u = 0$.

In order to avoid the complexities related to the boundaries of the network, we consider
the case in which $N \gg 1$. For simplicity, we take the generation and the degradation rates
to be $g_i = 1$ and $w_i = 1$ for $i = 1, \dots, J$. The reaction rate is $a_{ij} = a$ for each pair of nodes
$i$ and $j$ that react with each other. Under these conditions, the network is symmetrical and
the rate equations are identical for all nodes:

$$\frac{dn}{dt} = 1 - n(t) - r a n^2(t). \tag{25}$$

The steady state solution is thus

$$n = \frac{-1 + \sqrt{1 + 4ra}}{2ra}, \tag{26}$$

and the non-zero elements in the first order correlation matrix [Eqs. (4) and (5)] are

$$q = -\frac{4a}{\left( 1 + \sqrt{1 + 4ra} \right)^2}. \tag{27}$$

Two limits are observed. In the limit of strong interactions, where $a \gg 1$, the matrix
elements approach $q \simeq -1/r$. In the limit of weak interactions, where $a \ll 1$, one obtains
$q \simeq -a$. In any case the values that $q$ can take are limited to $-1/r < q < 0$.
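
The two limits can be verified numerically with a short sketch of Eqs. (26) and (27). The function names are hypothetical, and $g = w = 1$ is assumed as in the text.

```python
import math


def steady_state_tree(a, r):
    """Positive root of 1 - n - r a n^2 = 0 for a node of degree r."""
    return (-1 + math.sqrt(1 + 4 * r * a)) / (2 * r * a)


def q_tree(a, r):
    """Non-zero element of the first order correlation matrix of the tree."""
    return -4 * a / (1 + math.sqrt(1 + 4 * r * a)) ** 2


if __name__ == "__main__":
    r = 3  # tree of order m = 2
    for a in (1e-6, 2.0, 1e8):
        n = steady_state_tree(a, r)
        print(f"a = {a:g}: n = {n:.6g}, q = {q_tree(a, r):.6g}, "
              f"-a n^2 = {-a * n * n:.6g}")
```

The last printed column uses the equivalent form $q = -a n^2$, which follows from $n = 1/(1 + ran)$ at steady state, and should agree with the closed form for every $a$.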

For an infinite perfect tree with uniform rate constants, the correlation between all pairs
of species with the same values of $u$ and $v$ is the same. We denote this correlation by
$G_{u,v}$. In each row $i$ of the first order correlation matrix there are exactly $r$ non-zero terms:
one term for $X_i$'s parent and $m$ terms corresponding to $X_i$'s siblings. The correlation $G_{u,v}$
between two species $X_i$ and $X_j$ is thus carried via the parent of the species $X_i$, for which
the correlation with $X_j$ is $G_{u-1,v}$, and via the $m$ siblings of the species $X_i$, for which the
correlation with $X_j$ is $G_{u+1,v}$. Eq. (9) thus takes the form

$$\begin{aligned} G_{0,0} &= 1 \\ G_{0,v} &= q \left[ G_{0,v+1} + G_{0,v-1} + (m-1) G_{1,v} \right] && \text{for } v > 0 \\ G_{u,v} &= q \left( G_{u-1,v} + m G_{u+1,v} \right) && \text{for } u > 0. \end{aligned} \tag{28}$$



The first equation states that the correlation of every species with itself is unity. The second 
equation accounts for the correlations between species at the same branch, measuring the 
effect of variation in the higher level node on a node at a lower level. The third equation 
accounts for all the correlations that are not included in the first two equations. More 
specifically, it includes the correlations between species from different branches. It also 
includes the correlations between pairs on the same branch, measuring the effect of variation
in the lower level node on a node at a higher level. We seek a solution of the form
$G_{u,v} = e^{-\mathbf{k} \cdot \mathbf{d}}$, where $\mathbf{k} = (k_1, k_2)$ satisfies the condition that correlations vanish
between distant species. From the third equation one obtains

$$k_1 = \ln\left[ \frac{1}{2q} \left( 1 \pm \sqrt{1 - 4mq^2} \right) \right], \tag{29}$$

while from the second equation one obtains

$$k_2 = \ln\left\{ \frac{1}{2q} \left[ 1 - \frac{(m-1)q}{x} \pm \sqrt{\left( 1 - \frac{(m-1)q}{x} \right)^2 - 4q^2} \, \right] \right\}, \tag{30}$$



where $x = e^{k_1}$. In order to satisfy the condition that $G_{u,v}$ does not diverge for $u \gg 1$ while
$-1/r < q < 0$, one has to choose the solution with the plus sign for $k_1$ in Eq. (29). The
same condition for $v \gg 1$ requires one to choose the solution with the plus sign for $k_2$ in Eq. (30).

After some algebraic manipulations it can be shown that $k_1 = k_2$. The correlation between
any pair of species is thus

$$G_{u,v} = e^{i\pi d} e^{-d/d_0}, \tag{31}$$

where $d = u + v$ is the distance between the two species, and

$$d_0 = \left[ \ln\left| \frac{1 + \sqrt{1 - 4mq^2}}{2q} \right| \right]^{-1} \tag{32}$$

is the correlation length of the tree-like network. The correlation function is $F_{\rm cor}(d) =
\exp(-d/d_0)$. Note that for $r = 2$ ($m = 1$) this solution coincides with the solution obtained
for the linear network [Eq. (24)]. In the limit of weak interactions, where $a \ll 1$ and $q \to 0$,
the correlation function approaches $F_{\rm cor}(d) \simeq |q|^d$. In this limit, due to the weak interactions,
the correlation between a pair of species is dominated by the shortest path between them. In
the limit of strong interactions, where $a \gg 1$ and $q \to -1/r$, the correlation length satisfies
$d_0 \to 1/\ln(m)$. For $m > 1$ the correlation length is always finite. Since the average path
length of a perfect tree-like network must scale in some form with the number of levels in
the tree, one obtains that for a large enough tree network the connectivity will always be
less than unity. Thus a perfect tree-like network of order $m = 2$ or more will never be an
FSW. In Fig. 5 we show the correlation length $d_0$ as obtained for a metabolic network with
a perfect tree topology vs. the reaction rate $a$ (symbols). The results are shown for trees of
different orders. Here $g = w = 1$, and $a$ is varied.
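
The correlation length of the tree and its two limiting cases can be checked with the sketch below. It evaluates $d_0 = 1/\ln\left|\left(1 + \sqrt{1 - 4mq^2}\right)/(2q)\right|$; the function name and the handling of the marginal case are assumptions made for illustration.

```python
import math


def correlation_length_tree(q, m):
    """Correlation length of a perfect tree of order m for a given q < 0."""
    discriminant = 1 - 4 * m * q * q
    if q >= 0 or discriminant < 0:
        raise ValueError("q must be negative with 4 m q^2 <= 1")
    growth = abs((1 + math.sqrt(discriminant)) / (2 * q))
    # growth <= 1 only occurs at the marginal point m = 1, q = -1/2.
    return math.inf if growth <= 1 else 1 / math.log(growth)


if __name__ == "__main__":
    # m = 1 reproduces the linear network value for q = -1/4.
    print(correlation_length_tree(-0.25, 1), 1 / math.log(2 + math.sqrt(3)))
    # Strong-interaction limit q -> -1/(m + 1) gives 1 / ln(m).
    for m in (2, 3, 5):
        print(m, correlation_length_tree(-1 / (m + 1), m), 1 / math.log(m))
```

Because the strong-interaction value $1/\ln(m)$ is finite for every $m \geq 2$, increasing the reaction rate cannot make the correlation length of such a tree grow without bound.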



IV. MORE COMPLEX NETWORKS 



To demonstrate the applicability of the NCF method, we now refer to the analysis of a
set of more complex networks. Here analytical solutions are not available, and the correlation
matrix must be obtained numerically. We analyze three different topologies following
the structural classification proposed by Estrada [33]. The first example represents a class
of networks which are organized into highly connected modules with few connections between
them. The second example will be of a network with a highly connected central core
surrounded by a sparser periphery, and the last example will be of a scale-free network.

Consider a network constructed of three fully connected modules (communities), with
a single connection between each pair of communities. This network is displayed in Fig.
6(a). Here, each community consists of 13 nodes, adding up to a total of $J = 39$ nodes.
To obtain $n_i$, $i = 1, \dots, 39$, the steady state solution for the concentrations of the different
reacting species, we solve Eq. (2) using a standard Runge-Kutta stepper. The parameters
we use are $g_i = 1$ and $w_i = 1$. The reaction rate $a$ between pairs of reacting species is
also set to unity. We then construct the first order correlation matrix, $C_{ij}$, as appears
in Eq. (5). The complete correlation matrix, $G_{ij}$, is obtained from Eq. (9). It consists
of a set of $39 \times 39$ linear algebraic equations. Solving these equations, one obtains the
complete correlation matrix of the network. For this network, the main insight on the
global functions of the network can be deduced from the complete correlation matrix, which
is displayed in Fig. 6(b). The diagonal terms, which are all unity, are omitted from the
Figure. As expected, strong correlations appear between species within the same community
(sub-matrices along the diagonal), and vanishingly small correlations appear between species
from different communities. In fact, the correlation matrix is close to being a partitioned block
matrix, except for a few coupling terms between the blocks. In this case the correlation
matrix reflects the topological structure of the network, which is almost fully partitioned
into three isolated communities.

We now consider a network which features a highly connected central core surrounded by
a sparser periphery. This network consists of $J = 40$ nodes. The nodes $X_i$, $i = 16, \dots, 25$,
form a fully connected cluster (the core), while the 30 additional nodes are connected to all
the nodes in the core, but not to each other (the periphery). This network is shown in
Fig. 7(a). Following the same procedure described above, one obtains the correlation matrix for
this network [Fig. 7(b)]. The central square (domain I) shows the correlations between the
nodes in the central core. Domains II show the correlations between peripheral nodes and
central ones. The value of these correlations is high, expressing the strong dependence of
the peripheral nodes on the nodes in the central core. On the other hand, the correlations
between the central nodes and the peripheral ones (domains III) are very
low. This is an expected result, as deviations in the population of a node from
the periphery should have almost no effect on a node from the core. An interesting result
appears in domains IV. These domains show the correlations between pairs of nodes that are
both from the periphery. It turns out that the effect of these nodes on each other is stronger
than the effect they have on their adjacent nodes from the core. This holds even though the
topological distance between peripheral nodes is $d = 2$, while the distance between them
and the central nodes is $d = 1$. A small perturbation in a peripheral node results in a very
minor effect on all the central nodes. However, this minor change in the core results in a more
dramatic effect on all the rest of the peripheral nodes. This non-trivial result exemplifies
the importance of the functional methodology as a complementary analysis to the common
topological approach. In the two examples shown above, we focused on the insights provided
by the complete correlation matrix. Below we show an additional numerical example, where
we continue the analysis to obtain the correlation length, $d_0$, and the connectivity, $\eta$.
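
The topological distances quoted for the core-periphery network can be checked with a short Python sketch. It builds the 40-node network with the core at nodes 16 to 25 and runs a breadth-first search from one peripheral node. The function names and the adjacency-set representation are illustrative assumptions, and the sketch covers only the topology, not the correlation matrix itself.

```python
from collections import deque


def core_periphery_network(n_nodes=40, core=range(16, 26)):
    """Adjacency sets for nodes 1..n_nodes with a fully connected core.

    Peripheral nodes are linked to every core node but not to each other.
    """
    core = set(core)
    adj = {i: set() for i in range(1, n_nodes + 1)}
    for i in adj:
        for c in core:
            if i != c:
                adj[i].add(c)
                adj[c].add(i)
    return adj, core


def distances_from(adj, source):
    """Breadth-first search distances from `source` to all reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


adj, core = core_periphery_network()
periphery = set(adj) - core
dist = distances_from(adj, source=1)  # node 1 belongs to the periphery
print({dist[c] for c in core})             # distances to the core nodes
print({dist[p] for p in periphery - {1}})  # distances to other peripheral nodes
```

The two printed sets are `{1}` and `{2}`: every peripheral node is adjacent to the whole core and two steps away from every other peripheral node.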

One of the common characteristics of many realistic networks is a degree distribution
that follows a power law, namely $P(k) = a k^{-\lambda}$, where $a$ and $\lambda$ are positive constants.
Ecological networks, social networks and metabolic networks are characterized by power-law
degree distributions, and are referred to as scale-free networks. Such networks include some
nodes, called hubs, with a degree which is orders of magnitude higher than the average
degree in the network. Scale free networks are considered highly connected because, due
to these hubs, the average path length between nodes is small. In fact, in metabolic networks
the average path length was found to be as small as $\langle d \rangle \approx 3$ [31]. Below we examine a scale
free network which is a TSW network, and determine whether it is also an FSW network.

To construct a scale free network we use the preferential attachment algorithm. In
this algorithm a single new node is added at each iteration and $m$ edges are drawn from it
to the set of existing nodes. The probability of linking the new node to some existing node
$X_i$ is proportional to the current degree of the node $X_i$. This way, nodes which already have
a higher degree than others have a higher probability of obtaining more links and becoming
hubs. Here we constructed a scale free network consisting of $J = 75$ nodes. The number of
edges added in each iteration was $m = 3$. The result is the graph appearing in Fig. 8. The
diameter of this network is $D = 4$ and its average path length is $\langle d \rangle = 2.43$.
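
The growth rule above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the fully connected seed of $m$ nodes, the fixed random seed, the requirement $m \geq 2$ and all function names are assumptions. The diameter and average path length are obtained by a breadth-first search from every node.

```python
import random
from collections import deque


def preferential_attachment(n_nodes=75, m=3, seed=0):
    """Grow a network by preferential attachment; returns adjacency sets.

    Assumption: growth starts from a fully connected seed of m >= 2 nodes.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n_nodes)}
    stubs = []  # every node appears once per incident edge
    for i in range(m):
        for j in range(i + 1, m):
            adj[i].add(j)
            adj[j].add(i)
            stubs += [i, j]
    for new in range(m, n_nodes):
        targets = set()
        while len(targets) < m:
            # A uniformly chosen stub selects a node with probability
            # proportional to its current degree.
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj


def path_statistics(adj):
    """Average shortest path length and diameter via BFS from every node."""
    total, pairs, diameter = 0, 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
        diameter = max(diameter, max(dist.values()))
    return total / pairs, diameter


adj = preferential_attachment()
avg_d, diam = path_statistics(adj)
print(f"<d> = {avg_d:.2f}, D = {diam}")
```

Since the construction is random, a given realization is not expected to reproduce the quoted values of $\langle d \rangle$ and $D$ exactly, although networks of this size typically give comparably short paths.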

Solving Eq. (T5]) we obtain the steady state solution for the concentrations of all the
reactive species. The parameters are $g_i = 1$ and $d_i = 1$ for $i = 1, \dots, 75$. The reaction rate $a$
between pairs of reacting species is varied. In this case, obtaining the complete correlation
matrix, $G_{ij}$, requires the solution of $75 \times 75$ linear algebraic equations [Eq. (jUJ)]. We solve
these equations and then average over the correlations between equidistant species to obtain
the correlation function $F_{\rm cor}(d)$ [Eq. (TlOl)]. In Fig. 9 we show the resulting correlation
function $F_{\rm cor}(d)$ vs. $d$ for three different values of the reaction rate $a$ (symbols). When the
interaction is suppressed ($a < 1$) the correlations decay rapidly. When the interaction is
dominant ($a > 1$), correlations are maintained over long distances. By fitting the correlation
functions to exponential functions (solid lines) one obtains the typical correlation length, $d_0$,
and the connectivity, $\eta$, of each of the networks. The results for $\eta$ vs. the reaction rate $a$ are
shown in Fig. 10. It is found that the connectivity increases logarithmically with $a$. Note
that for a very wide range of values of the parameter $a$, the connectivity remains lower than
unity. This means that although the examined scale free network is a TSW, for a very wide
range of parameters it is not an FSW. Only in the extreme cases of very strong interactions
FSW behavior might emerge.
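
The post-processing steps described above (averaging over equidistant pairs, fitting an exponential, and dividing the correlation length by the average path length) can be sketched in Python. The sketch assumes that the correlation matrix and the distance matrix are already available. Taking absolute values, averaging over ordered pairs, fitting the form $A e^{-d/d_0}$ with a free prefactor, and the toy input are all illustrative assumptions rather than details taken from the paper.

```python
import math
from collections import defaultdict


def correlation_function(G, dist):
    """Average |G[i][j]| over all ordered pairs (i, j) at each distance d."""
    sums, counts = defaultdict(float), defaultdict(int)
    n = len(G)
    for i in range(n):
        for j in range(n):
            if i != j:
                d = dist[i][j]
                sums[d] += abs(G[i][j])
                counts[d] += 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}


def fit_correlation_length(F):
    """Least-squares fit of ln F(d) = ln A - d / d0; returns d0."""
    ds = [d for d, f in F.items() if f > 0]
    ys = [math.log(F[d]) for d in ds]
    mean_d = sum(ds) / len(ds)
    mean_y = sum(ys) / len(ys)
    slope = (sum((d - mean_d) * (y - mean_y) for d, y in zip(ds, ys))
             / sum((d - mean_d) ** 2 for d in ds))
    return -1.0 / slope


def connectivity(d0, avg_path_length):
    """Connectivity: ratio of correlation length to average path length."""
    return d0 / avg_path_length


# Toy input (not data from the paper): a 4-node chain whose correlations
# decay as exp(-|i - j| / 2).
n = 4
dist = [[abs(i - j) for j in range(n)] for i in range(n)]
G = [[math.exp(-dist[i][j] / 2.0) for j in range(n)] for i in range(n)]

F = correlation_function(G, dist)
d0 = fit_correlation_length(F)
avg_d = sum(dist[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
eta = connectivity(d0, avg_d)
print(f"d0 = {d0:.3f}, <d> = {avg_d:.3f}, eta = {eta:.3f}")
```

For the toy chain the fit recovers the imposed decay length of 2, and the resulting connectivity exceeds unity because the chain is short; for a real network the same three steps would be applied to the computed $G_{ij}$.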



V. SUMMARY AND DISCUSSION 



We have presented the NCF method for the analysis and evaluation of the connectivity of 
interaction networks. The method complements the topological analysis of networks, taking 
into account the functional nature of the interactions and their strengths. The method
enables one to obtain the correlation matrix, which provides the correlations between pairs of
directly and indirectly interacting nodes. In certain cases, one may gain insight into the
network's functionality by writing down the complete correlation matrix. For instance, one
can identify domains of high and low correlations. In other cases it is more insightful to
extract the macroscopic characteristics of the network from the matrix. In particular, we
have shown how to calculate the typical correlation length of the network. This correlation
length, which reflects the functionality of the network, can be compared to topological
characteristics such as the average minimum path length of the network. The ratio between
these two lengths provides the characteristic connectivity of the network. It was shown that
the topological analysis alone is not sufficient to characterize the functionality of
a network. For instance, networks with small world topology may display low connectivity,
while networks that do not exhibit small world topology may display high connectivity. This 
is because in terms of the functionality of the network, when the correlation length is large, 
even distant species may be highly correlated. We demonstrated the method for metabolic 
networks with different topological structures, and identified the regimes of low connectivity 
and of high connectivity. As expected, these regimes depend on topological features, such 
as the number of species or the average minimum path length between pairs of species. 
However, they also depend on functional features such as the type of interactions in the 
network and the rate constants of the different processes. 

The NCF method was demonstrated for metabolic networks, but its applicability is much
wider. In fact, the method could be applied to any reaction network that can be modeled by
rate equations. Such networks include metabolic networks [34], chemical networks [35], [36],
gene expression networks [37], [38] and ecological networks [39]. It is common to use rate
equations for the modeling of these types of networks. In certain models of social networks,
the flow of information as well as the spreading of viruses can also be described by rate
equations. The method is not suitable for obtaining the correlations in Ising-type models,
where the nodes are assigned discrete variables, which cannot be modeled using continuous
equations. The number of elements in the correlation matrix is equal to the number of pairs
of nodes in the system. When applying the NCF method, one writes a single linear equation
for each matrix element. Thus, from a computational point of view, the number of equations
in the NCF method scales quadratically with the number of reactive species. This enables the
application of the method to networks which include even thousands of nodes. It is
straightforward to extend the application of the method to the other types of interaction
networks mentioned above. A few examples are addressed below.

Consider, for example, gene expression networks. These networks consist of genes and 
proteins that interact with each other. In addition to protein-protein interactions, already 
analyzed in the context of metabolic networks, genetic networks include transcriptional
regulation processes, where some genes regulate the expression of other genes. In recent years,
much information has been acquired about the topology of these networks for certain
organisms such as Escherichia coli. The problem is that these networks are very elaborate,
and may consist of thousands of nodes. This limits our ability to simulate their functionality,
and thus, currently, most of the theoretical and computational analysis of these networks is
focused on small modules [38]. In this analysis, one performs simulations of small
subnetworks consisting of only a few nodes. These subnetworks are expected to play specific roles
in the functionality of the network as a whole. Such an approach is valid if an isolated module
maintains its function when incorporated in a large network in which it interacts with many
other genes. We expect the analysis presented here to provide some insight into this matter.
By obtaining the complete correlation matrix, one can characterize the dependence of
different proteins and genes on one another. The network may then be divided into subnetworks,
grouping together nodes that are highly correlated, and excluding ones that are not. It is
expected that these modules will not function significantly differently when analyzed in the
context of the surrounding network nodes. In addition, the typical correlation length will
provide us with an approximate radius beyond which correlations may be neglected. To
simulate a module properly, one needs to include all the nodes which are within that radius
from the module. Other possible applications regard social networks. For instance, the
process of viral spreading could be analyzed [41]. Many social networks are known to
be small world networks. However, this does not mean that any contagious disease spreads
rapidly. This is, possibly, because for certain diseases the correlation length is small. Using
the method presented here, one can obtain this correlation length, taking into account the
specific rate constants of the viral flow.
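
Returning to the rule stated above for gene expression modules (include every node within the correlation radius of the module), a minimal Python sketch of the node selection is given below. The function name, the adjacency-set representation and the toy chain network are hypothetical. The radius would in practice be the correlation length obtained from the NCF analysis; here it is simply passed in as a number.

```python
from collections import deque


def nodes_within_radius(adj, module, radius):
    """All nodes whose distance to the nearest module node is <= radius."""
    dist = {node: 0 for node in module}
    queue = deque(module)
    while queue:
        u = queue.popleft()
        if dist[u] + 1 > radius:
            continue  # neighbors of u would lie beyond the cutoff radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


# Toy chain 0-1-...-9 (hypothetical example network).
adj = {i: {j for j in (i - 1, i + 1) if 0 <= j <= 9} for i in range(10)}
print(sorted(nodes_within_radius(adj, module=[4, 5], radius=2)))
```

For the toy chain with the module {4, 5} and a radius of 2, the selected set is nodes 2 through 7. A non-integer radius is effectively rounded down, since topological distances are integers.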

The recent applications of graph theory to many natural macroscopic systems were enabled
by focusing on their topology. This approach has been very fruitful, as it uncovered the
common structure of networks from many different fields. In particular, the scale free
degree distribution and the small world topology were found to be ubiquitous. However, it is
still not completely clear what functional meaning can be given to these topological properties
in different contexts. A recently proposed approach derives the key aspects of the network
functionality from its topological structure [42]. Other approaches use the Ising Hamiltonian
to describe the interaction pattern between nodes on scale free and small world networks
[44]. Functional characteristics such as phase transitions and critical exponents are then
observed. The NCF method presented in this paper complements these approaches. It can
be applied to a variety of different interaction processes, such as metabolic, ecological or
social interactions, all of which can be described by rate equations. We believe that the
approach presented here will lead to new insights into the behavior of networks and their
functionality.

FIG. 1: The linear metabolic network. Each molecular species $X_i$ reacts with its two nearest
neighbors, $X_{i-1}$ and $X_{i+1}$.

FIG. 2: The correlation length, $d_0$, of the linear metabolic network versus (a) the generation
rate, $g$; (b) the degradation rate, $w$; and (c) the reaction rate, $a$. High connectivity is reached
when the primary process is the reaction process (proportional to $g$ and $a$). The correlation
length $d_0$ decreases with increasing $w$ (as the degradation becomes dominant).

FIG. 3: (color online) The correlation function $F_{\rm cor}(d)$ for the linear metabolic network as obtained
from a numerical simulation for different values of the interaction rate $a$ (symbols). To conduct the
numerical test we integrate the equations for the linear network until they reach the steady state.
Then we force a small perturbation $\Delta n_0$ on the concentration $n_0$ of the species $X_0$. We
evaluate the correlation function using $F_{\rm cor}(d) = |\Delta n_d / \Delta n_0|$. The correlations decay exponentially
with the distance between species. The typical correlation length increases as the reaction rate is
increased. The results are in agreement with the theoretical results of Eq. (23) (solid lines). Slight
deviations appear due to the fact that in numerical simulations $\Delta n_0$ must be finite.
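
The numerical test described in the caption of Fig. 3 can be sketched in Python under stated assumptions. The rate equations used here, $dn_i/dt = g - w n_i - a n_i (n_{i-1} + n_{i+1})$, are an assumed form for the linear network with generation rate $g$, degradation rate $w$ and reaction rate $a$. Instead of integrating in time, the sketch finds the steady state by fixed-point iteration, and the perturbation is imposed by holding the central concentration at a slightly shifted value. The chain length, parameter values and function names are illustrative.

```python
def steady_state(n_nodes, g, w, a, clamp=None, sweeps=500):
    """Steady state of the assumed linear-chain rate equations

        dn_i/dt = g - w*n_i - a*n_i*(n_{i-1} + n_{i+1}),

    found by iterating n_i = g / (w + a*(n_{i-1} + n_{i+1})).
    `clamp` is an optional (index, value) pair held fixed throughout.
    """
    n = [g / w] * n_nodes
    if clamp is not None:
        n[clamp[0]] = clamp[1]
    for _ in range(sweeps):
        new = []
        for i in range(n_nodes):
            left = n[i - 1] if i > 0 else 0.0
            right = n[i + 1] if i < n_nodes - 1 else 0.0
            new.append(g / (w + a * (left + right)))
        if clamp is not None:
            new[clamp[0]] = clamp[1]
        n = new
    return n


def correlation_function(n_nodes=21, g=1.0, w=1.0, a=0.5,
                         delta=1e-4, d_max=5):
    """F_cor(d) = |dn_d / dn_0| for a perturbation of the central species."""
    center = n_nodes // 2
    base = steady_state(n_nodes, g, w, a)
    perturbed = steady_state(n_nodes, g, w, a,
                             clamp=(center, base[center] + delta))
    return {d: abs(perturbed[center + d] - base[center + d]) / delta
            for d in range(1, d_max + 1)}


F_cor = correlation_function()
for d, f in F_cor.items():
    print(d, f)
```

With these parameter values the printed values decrease by a nearly constant factor from one distance to the next, which is the exponential decay with distance referred to in the caption. The plain iteration converges for moderate $a$; stronger interactions may require damping or a proper time integration.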

FIG. 4: A tree-like network with $N = 5$ levels. Each node is linked with exactly one node at the
level above it (father), and $m$ nodes at the level below (children). The top node (here at level $l = 5$)
is the root node. The order of the tree is $m = 2$, and the degree of the nodes is $r = 3$. The path
between a pair of nodes is characterized by the number of upward steps followed by the number
of downward steps to get from one node to the other. For the path between the two shaded nodes,
$d = (2, 3)$.

FIG. 5: (color online) The correlation length $d_0$ versus the reaction rate $a$ for a metabolic network
with a perfect tree structure. The results are shown for trees of different order, $m$ (symbols).
For $m = 1$ the results coincide with those obtained for the linear network. For higher orders
the correlation length is bounded from above by $d^{\max} = 1/\ln(m)$ (gray horizontal lines). Thus,
for a sufficiently large tree-like network, where the average path length is larger than $d^{\max}$, the
connectivity is always less than unity. Tree-like networks are thus not expected to display FSW
behavior.



FIG. 6: (color online) (a) A network constructed of three fully connected modules, with single
bonds between them. (b) The correlation matrix features high correlations within the modules, and
very small correlations between pairs of nodes from different modules. The matrix is constructed of
three almost uncoupled blocks, reflecting the nearly partitioned topology of the network. The diagonal
terms, which are all unity, do not appear in the figure.

FIG. 7: (color online) (a) A network consisting of a dense fully connected core with a sparse
periphery. The peripheral nodes are each connected to all the central ones, but not to each other.
(b) The correlation matrix shows strong correlations between pairs of species from the core (domain
I). The strongest dependence is that of nodes from the periphery on nodes from the core (domains
II). However, nodes from the core are almost unaffected by nodes from the periphery (domains
III). Interestingly, the correlations between pairs of nodes from the periphery are not so low, even
though they are not directly connected to one another (domains IV). The diagonal terms, which
are all unity, do not appear in the figure.

FIG. 8: Scale free network consisting of 75 reacting species, constructed using the preferential
attachment algorithm. The average path length of this network is $\langle d \rangle = 2.43$ and its diameter is
$D = 4$.

FIG. 9: (color online) The correlation function vs. distance as obtained for the scale free network
appearing in Fig. 8 for different values of the parameter $a$ (symbols). The correlations decay
rapidly for low values of $a$, and more gradually for large values of $a$. By fitting the correlation
function to an exponential (solid lines), the correlation length $d_0$ can be obtained.

FIG. 10: The connectivity, $\eta$, vs. $a$ as obtained for the scale free network shown in Fig. 8. The
connectivity increases logarithmically as a function of $a$. Note that $\eta < 1$ for a very broad range
of values of the parameter $a$. This implies that although scale-free networks are commonly TSW
networks, in the functional sense they may not be FSW networks.